AITopics

Country:

North America > United States > Massachusetts > Suffolk County > Boston (0.05)
Asia > Middle East > Israel (0.05)

Industry: Health & Medicine (0.97)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.57)

Reimers, Felix Simon, Peters, Carl-Hendrik, Nichele, Stefano

Benchmarking the State of Networks with a Low-Cost Method Based on Reservoir Computing

arXiv.org Artificial Intelligence Sep-1-2025

Using data from mobile network utilization in Norway, we showcase the possibility of monitoring the state of communication and mobility networks with a non-invasive, low-cost method. This method transforms the network data into a model within the framework of reservoir computing and then measures the model's performance on proxy tasks. Experimentally, we show how the performance on these proxies relates to the state of the network. A key advantage of this approach is that it uses readily available data sets and leverages the reservoir computing framework for an inexpensive and largely agnostic method. Data from mobile network utilization is available in an anonymous, aggregated form with multiple snapshots per day. This data can be treated like a weighted network. Reservoir computing allows the use of weighted but untrained networks as a machine learning tool. The network, initialized as a so-called echo state network (ESN), projects incoming signals into a higher-dimensional space, on which a single trained layer operates. This consumes less energy than deep neural networks, in which every weight of the network is trained. We use neuroscience-inspired tasks and train our ESN model to solve them. We then show how the performance depends on certain network configurations and how it visibly decreases when the network is perturbed. While this work serves as a proof of concept, we believe it can be extended to near-real-time monitoring and to the identification of possible weak spots in both mobile communication networks and transportation networks.
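The idea of using a weighted network as an untrained reservoir with a single trained readout can be sketched as follows. This is a minimal illustration, not the authors' implementation: the random adjacency matrix stands in for the mobile network data, and the delay-recall proxy task, the `tanh` update, the spectral-radius rescaling, and all function names and parameter values are assumptions made for the example.

```python
import numpy as np

def make_reservoir(adjacency, spectral_radius=0.9):
    """Rescale a weighted adjacency matrix to a chosen spectral radius (assumed preprocessing)."""
    W = np.asarray(adjacency, dtype=float)
    rho = np.max(np.abs(np.linalg.eigvals(W)))
    if rho == 0:
        raise ValueError("adjacency matrix has zero spectral radius")
    return W * (spectral_radius / rho)

def run_reservoir(W, W_in, inputs):
    """Drive the fixed, untrained reservoir with a scalar input signal and collect its states."""
    state = np.zeros(W.shape[0])
    states = []
    for u in inputs:
        state = np.tanh(W @ state + W_in * u)
        states.append(state.copy())
    return np.array(states)

def train_readout(states, targets, ridge=1e-6):
    """Fit the single trained layer by ridge regression (with a bias column)."""
    X = np.hstack([states, np.ones((states.shape[0], 1))])
    A = X.T @ X + ridge * np.eye(X.shape[1])
    return np.linalg.solve(A, X.T @ targets)

def predict(states, w_out):
    """Apply the trained linear readout to reservoir states."""
    X = np.hstack([states, np.ones((states.shape[0], 1))])
    return X @ w_out

def delay_task_error(adjacency, delay=2, n_steps=600, washout=100, seed=0):
    """Training RMSE on a hypothetical proxy task: recall the input from `delay` steps ago."""
    rng = np.random.default_rng(seed)
    W = make_reservoir(adjacency)
    W_in = rng.uniform(-0.5, 0.5, size=W.shape[0])
    u = rng.uniform(-1.0, 1.0, size=n_steps)
    states = run_reservoir(W, W_in, u)
    X = states[washout:]                       # discard the initial transient
    y = u[washout - delay : n_steps - delay]   # input delayed by `delay` steps
    w_out = train_readout(X, y)
    return float(np.sqrt(np.mean((predict(X, w_out) - y) ** 2)))

# Stand-in for a network snapshot: a sparse, non-negative weighted matrix.
rng = np.random.default_rng(1)
adjacency = rng.random((30, 30)) * (rng.random((30, 30)) < 0.2)

# Hypothetical perturbation: randomly remove roughly 30% of the remaining links.
perturbed = adjacency * (rng.random(adjacency.shape) < 0.7)

baseline_error = delay_task_error(adjacency)
perturbed_error = delay_task_error(perturbed)
```

Comparing `baseline_error` with `perturbed_error` mirrors the abstract's idea of reading the network state from task performance. For brevity the error is measured on the training data; a held-out split would be the more careful choice.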

artificial intelligence, machine learning, node, (16 more...)

2508.2142

Country: Europe > Norway (0.36)

Genre: Research Report (0.82)

Industry:

Telecommunications (1.00)
Information Technology > Networks (1.00)

Technology:

Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Neural Information Processing Systems Aug-15-2025, 20:21:49 GMT

b090409688550f3cc93f4ed88ec6cafb-AuthorFeedback.pdf

deep model, sparse, sparsity, (16 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.30)

Neural Information Processing Systems Aug-15-2025, 08:31:45 GMT

APPENDIX: In this section, we provide the details of our implementation and proofs for reproducibility

's hidden state by h. Then we need to calculate the second part of Eq. Using Bayes' theorem, we have: p In Section 4.3, we devise a Sigmoid function to adapt γ during the supernet training, which is defined as: $\gamma(t) = 1 - \mathrm{Sigmoid}\left(\left(\frac{t}{\text{total epochs}} \times 2 - 1\right) \times b\right)$ (19). Section 3.2 theoretically demonstrates the benefit of the proposed architecture complementation loss function,
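One plausible reading of the schedule in Eq. (19) can be sketched in a few lines. The form assumed here is γ(t) = 1 − Sigmoid((2t / total epochs − 1) · b), and the default value of `b` is an arbitrary choice for illustration; neither is confirmed by the excerpt.

```python
import math

def gamma_schedule(epoch, total_epochs, b=5.0):
    """Assumed form: gamma(t) = 1 - sigmoid((2 * t / total_epochs - 1) * b)."""
    x = (2.0 * epoch / total_epochs - 1.0) * b  # maps the epoch to [-b, b]
    return 1.0 - 1.0 / (1.0 + math.exp(-x))

# gamma falls from close to 1 at the start of training to close to 0 at the end.
schedule = [gamma_schedule(t, 100) for t in range(0, 101, 25)]
```

Under this reading, `b` controls how sharply γ switches from high to low around the midpoint of training.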

architecture, implementation and proof, node, (14 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.35)

Neural Information Processing Systems Aug-15-2025, 05:14:07 GMT

A Bayesian Inference over Neural Networks: On a supervised model parameterized by W, we seek to infer the conditional distribution W | D

The prior and likelihood are both modelling choices. A.1 Likelihoods for BNNs: The likelihood is purely a function of the model prediction Φ. As exact posterior inference via (11) is intractable, we instead rely on approximate inference algorithms, which can be broadly grouped into two classes based on their method of approximation. A concrete label can be obtained by choosing the class with the highest output value. The Gaussian variational family is a common choice. Estimators for the integral in (15) are necessary.
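The combination of a Gaussian variational family, an estimator for an intractable integral, and a label chosen as the highest-scoring class can be illustrated with a Monte Carlo estimate of the predictive distribution. This is a sketch under assumptions: the model is a single linear softmax layer, the approximate posterior is a mean-field Gaussian with hypothetical parameters `mean` and `std`, and the excerpt does not say whether this is the integral referred to in (15).

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over the last axis."""
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def mc_predictive(x, mean, std, n_samples=200, seed=0):
    """Average class probabilities over weights drawn from q(W) = N(mean, std^2)."""
    rng = np.random.default_rng(seed)
    probs = np.zeros((x.shape[0], mean.shape[1]))
    for _ in range(n_samples):
        W = mean + std * rng.standard_normal(mean.shape)  # one sample from q(W)
        probs += softmax(x @ W)
    return probs / n_samples

# Hypothetical variational parameters for a 2-feature, 2-class model.
mean = np.array([[2.0, -2.0], [-2.0, 2.0]])
std = np.full_like(mean, 0.1)
x = np.array([[1.0, 0.0], [0.0, 1.0]])

probs = mc_predictive(x, mean, std)
labels = probs.argmax(axis=1)  # concrete label: class with the highest output value
```

Increasing `n_samples` reduces the variance of the estimate at a proportional cost in forward passes.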

dataset, inference, initial learning rate, (14 more...)

Country:

North America > United States > Massachusetts > Suffolk County > Boston (0.04)
Asia > Middle East > Israel (0.04)

Industry:

Health & Medicine (0.94)
Banking & Finance (0.94)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.97)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.65)

Perin, Andrea, Lagomarsini, Giacomo, Gallicchio, Claudio, Nuti, Giuseppe

Mixture of Raytraced Experts

arXiv.org Artificial Intelligence Jul-17-2025

We introduce a Mixture of Raytraced Experts, a stacked Mixture of Experts (MoE) architecture which can dynamically select sequences of experts, producing computational graphs of variable width and depth. Existing MoE architectures generally require a fixed amount of computation for a given sample. Our approach, in contrast, yields predictions with increasing accuracy as the computation cycles through the sequence of experts. We train our model by iteratively sampling from a set of candidate experts, unfolding the sequence akin to how Recurrent Neural Networks are trained. Our method does not require load-balancing mechanisms, and preliminary experiments show a reduction in training epochs of 10% to 40% with comparable or higher accuracy. These results point to new research directions in the field of MoEs, allowing the design of potentially faster and more expressive models. The code is available at https://github.com/nutig/RayTracing

machine learning, natural language, node, (20 more...)

2507.12419

Country: Europe (0.28)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)

Pardo, Luis Miguel, Sebastián, Daniel

Erzeugungsgrad, VC-Dimension and Neural Networks with rational activation function

arXiv.org Artificial Intelligence Apr-16-2025

The notion of Erzeugungsgrad was introduced by Joos Heintz in 1983 to bound the number of non-empty cells occurring after a process of quantifier elimination. We extend this notion and the combinatorial bounds of Theorem 2 in Heintz (1983) using the degree for constructible sets defined in Pardo-Sebastián (2022). We show that the Erzeugungsgrad is the key ingredient to connect affine Intersection Theory over algebraically closed fields and the VC-Theory of Computational Learning Theory for families of classifiers given by parameterized families of constructible sets. In particular, we prove that the VC-dimension and the Krull dimension are linearly related up to logarithmic factors based on Intersection Theory. Using this relation, we study the density of correct test sequences in evasive varieties. We apply these ideas to analyze parameterized families of neural networks with rational activation function.

artificial intelligence, irreducible component, machine learning, (18 more...)

2504.11345

Country: Europe > Spain (0.46)

Genre: Research Report (0.63)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (1.00)

Huijzer, Anne-Men, Chaffey, Thomas, Besselink, Bart, van Waarde, Henk J.

Convergence of energy-based learning in linear resistive networks

arXiv.org Artificial Intelligence Feb-28-2025

Energy-based learning algorithms are alternatives to backpropagation and are well-suited to distributed implementations in analog electronic devices. However, a rigorous theory of convergence is lacking. We make a first step in this direction by analysing a particular energy-based learning algorithm, Contrastive Learning, applied to a network of linear adjustable resistors. It is shown that, in this setup, Contrastive Learning is equivalent to projected gradient descent on a convex function, for any step size, giving a guarantee of convergence for the algorithm.

Backpropagation is the most popular method of training artificial neural networks. However, while artificial neural networks are inspired by biological nervous systems, it has long been observed that backpropagation is not biologically plausible [1]-[3]. Several biologically plausible alternatives to backpropagation have been proposed in the literature, among them so-called energy-based learning algorithms [4]-[11]. These algorithms apply to energy-based models, which come equipped with some generalized notion of energy, and associate to each input a minimum of this energy. The basic idea is to probe the system in two states, one free and one clamped, or dictated by the training data, and use the energy difference between these states as a cost function. An iterative procedure is then applied to minimise this cost function. Several clamping mechanisms and iterative procedures have been defined, among them Contrastive Learning [4], [5], [12], Equilibrium Propagation [7], Coupled Learning [9] and Temporal Contrastive Learning [13]. These algorithms all resemble gradient descent, where the gradient of the cost function is replaced by a gradient-like quantity which may be computed in a distributed manner across a network. The energy-based learning paradigm is particularly suited to learning in analog electronic devices, as they have a natural notion of generalized energy: the heat dissipated by electrical resistance (in this case, a power rather than energy).

M. A. Huijzer, B. Besselink, and H.J. van Waarde are with the Bernoulli Institute for Mathematics, Computer Science, and Artificial Intelligence, University of Groningen, Groningen, The Netherlands; email: m.a.huijzer@rug.nl. T. Chaffey was with the Control Group, Department of Engineering, University of Cambridge, UK, and is now with the School of Electrical and Computer Engineering, University of Sydney, Australia; email: thomas.chaffey@sydney.edu.au.

This is, in part, due to the ability of analog circuits to perform inference many times faster than conventional neural networks [20]-[22].
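The abstract's key claim is an equivalence with projected gradient descent on a convex function, so a generic sketch of that technique may help. It is not the paper's cost function or circuit model: the least-squares objective, the box constraint (a stand-in for an admissible range of adjustable parameters), and all names are assumptions. The step size is set to 1/L for this generic problem; the any-step-size guarantee stated in the abstract is specific to the paper's setting and is not reproduced here.

```python
import numpy as np

def project_box(x, lower, upper):
    """Euclidean projection onto the box {x : lower <= x <= upper}."""
    return np.clip(x, lower, upper)

def projected_gradient_descent(A, b, lower, upper, n_iters=500):
    """Minimise 0.5 * ||A x - b||^2 subject to lower <= x <= upper."""
    L = np.linalg.norm(A, 2) ** 2  # Lipschitz constant of the gradient
    step = 1.0 / L
    x = project_box(np.zeros(A.shape[1]), lower, upper)
    for _ in range(n_iters):
        grad = A.T @ (A @ x - b)                          # gradient of the convex objective
        x = project_box(x - step * grad, lower, upper)    # gradient step, then projection
    return x

# Small separable example: the unconstrained minimiser (2, -1) lies outside the box.
A = np.diag([1.0, 2.0])
b = np.array([2.0, -2.0])
x_opt = projected_gradient_descent(A, b, lower=0.1, upper=1.0)
```

Each iteration takes a plain gradient step and then maps the result back onto the feasible set, which is the structure the abstract attributes to Contrastive Learning in the linear resistive setting.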

algorithm, contrastive learning, convergence, (13 more...)